24 research outputs found

    Tracking Gaze and Visual Focus of Attention of People Involved in Social Interaction

    Get PDF
    The visual focus of attention (VFOA) has been recognized as a prominent conversational cue. We are interested in estimating and tracking the VFOAs associated with multi-party social interactions. We note that in this type of situation the participants either look at each other or at an object of interest, so their eyes are not always visible. Consequently, neither gaze nor VFOA estimation can be based on eye detection and tracking. We propose a method that exploits the correlation between eye gaze and head movements. Both VFOA and gaze are modeled as latent variables in a Bayesian switching state-space model. The proposed formulation leads to a tractable learning procedure and to an efficient algorithm that simultaneously tracks gaze and visual focus. The method is tested and benchmarked on two publicly available datasets that contain typical multi-party human-robot and human-human interactions.
    Comments: 15 pages, 8 figures, 6 tables
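
    The paper's model itself is not reproduced in this abstract, so the following is only a rough sketch of the general technique it names: a switching state-space filter in which a discrete VFOA target drives a continuous gaze state that is observed through the head pose. All matrices, noise levels, and target directions below are invented for illustration, and the moment-matching (GPB(1)) approximation is one standard way of keeping such filters tractable, not necessarily the authors' choice.

        import numpy as np

        rng = np.random.default_rng(0)

        targets = np.array([[-0.6, 0.0], [0.3, 0.1], [0.0, -0.4]])  # hypothetical (pan, tilt) gaze direction of each VFOA target
        K = len(targets)
        A = np.eye(2) * 0.8    # gaze relaxes toward the active target: g_t = A g_{t-1} + (I - A) mu_f + w_t
        Q = np.eye(2) * 0.01   # gaze process noise
        C = np.eye(2) * 0.5    # head pose as an attenuated copy of gaze: h_t = C g_t + v_t
        R = np.eye(2) * 0.02   # head-pose observation noise
        T = np.full((K, K), 0.05)
        np.fill_diagonal(T, 0.90)  # sticky VFOA transitions

        def filter_step(m, P, w, h):
            """One forward step: returns the collapsed gaze Gaussian (m, P) and the VFOA posterior w."""
            means, covs, logls = [], [], []
            for j in range(K):
                # Predict gaze under the hypothesis "looking at target j".
                mp = A @ m + (np.eye(2) - A) @ targets[j]
                Pp = A @ P @ A.T + Q
                # Kalman update with the observed head pose.
                S = C @ Pp @ C.T + R
                G = Pp @ C.T @ np.linalg.inv(S)
                r = h - C @ mp
                means.append(mp + G @ r)
                covs.append((np.eye(2) - G @ C) @ Pp)
                logls.append(-0.5 * (r @ np.linalg.solve(S, r) + np.log(np.linalg.det(S))))
            # HMM forward recursion over the discrete VFOA.
            logw = np.log(T.T @ w + 1e-12) + np.array(logls)
            w = np.exp(logw - logw.max())
            w /= w.sum()
            # Collapse the Gaussian mixture by moment matching (GPB(1)).
            m = sum(wj * mj for wj, mj in zip(w, means))
            P = sum(wj * (Pj + np.outer(mj - m, mj - m)) for wj, Pj, mj in zip(w, covs, means))
            return m, P, w

        # Toy run: the person looks at target 1 while the head only partially turns.
        m, P, w = np.zeros(2), np.eye(2), np.ones(K) / K
        for t in range(50):
            h = C @ targets[1] + rng.normal(0.0, 0.1, 2)  # simulated noisy head pose
            m, P, w = filter_step(m, P, w, h)
        print("VFOA posterior:", np.round(w, 2), "estimated gaze:", np.round(m, 2))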

    Etude de la direction du regard dans le cadre d'interactions sociales incluant un robot

    No full text
    Robots are more and more used in a social context. They are required not only to share physical space with humans but also to interact with them. In this context, the robot is expected to understand a number of ambiguous verbal and non-verbal cues that are constantly used in natural human interaction. In particular, knowing who or what people are looking at is very valuable information for understanding each individual's mental state as well as the interaction dynamics. This is called the Visual Focus of Attention, or VFOA. In this thesis, we are interested in using the inputs of an active humanoid robot – participating in a social interaction – to estimate who is looking at whom or what.

    On the one hand, we want the robot to look at people so that it can extract meaningful visual information from its video camera. We propose a novel reinforcement learning method for robotic gaze control. The model is based on a recurrent neural network architecture. The robot autonomously learns a strategy for moving its head (and camera) using audio-visual inputs, and is able to focus on groups of people in a changing environment.

    On the other hand, the camera images are used to infer the VFOAs of people over time. We estimate the 3D head pose (location and orientation) of each face, as it is highly correlated with the gaze direction. We use it in two tasks. First, we note that objects may be looked at while not being visible from the robot's point of view. Under the assumption that the objects of interest are being looked at, we propose to estimate their locations relying solely on the gaze directions of the visible people. We formulate an ad hoc spatial representation based on probability heat-maps, and we design and train several convolutional neural network models to perform a regression from the space of head poses to the space of object locations; this provides a set of object locations from a sequence of head poses. Second, we suppose that the locations of the objects of interest are known. In this context, we introduce a Bayesian probabilistic model, inspired by psychophysics, that describes the dependency over time between head poses, object locations, eye-gaze directions, and VFOAs. The formulation is based on a switching state-space Markov model. A specific filtering procedure to infer the VFOAs is detailed, as well as an adapted training algorithm.

    The proposed contributions use data-driven approaches and are addressed within the context of machine learning. All methods have been tested on publicly available datasets. Some training procedures additionally require simulating synthetic scenarios; the generation process is explicitly detailed.
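
    As a hypothetical illustration of the heat-map regression described above (the abstract does not specify the architecture), here is a minimal PyTorch sketch that maps a sequence of head poses to a probability heat-map over a discretized top-view grid. The 6-D pose encoding, the layer sizes, and the 32x32 grid are assumptions made for the example.

        import torch
        import torch.nn as nn

        class HeatmapRegressor(nn.Module):
            """Maps a sequence of head poses to a 2-D heat-map of likely object locations."""
            def __init__(self, pose_dim=6, grid=32):
                super().__init__()
                self.grid = grid
                # Temporal 1-D convolutions over the head-pose sequence.
                self.encoder = nn.Sequential(
                    nn.Conv1d(pose_dim, 32, kernel_size=5, padding=2), nn.ReLU(),
                    nn.Conv1d(32, 64, kernel_size=5, padding=2), nn.ReLU(),
                    nn.AdaptiveAvgPool1d(1),  # pool over time
                )
                # Transposed convolutions upsample a 4x4 seed to a grid x grid map.
                self.decoder = nn.Sequential(
                    nn.Linear(64, 64 * 4 * 4), nn.ReLU(),
                    nn.Unflatten(1, (64, 4, 4)),
                    nn.ConvTranspose2d(64, 32, kernel_size=4, stride=2, padding=1), nn.ReLU(),  # 8x8
                    nn.ConvTranspose2d(32, 16, kernel_size=4, stride=2, padding=1), nn.ReLU(),  # 16x16
                    nn.ConvTranspose2d(16, 1, kernel_size=4, stride=2, padding=1),              # 32x32
                )

            def forward(self, poses):                    # poses: (batch, time, pose_dim)
                z = self.encoder(poses.transpose(1, 2))  # -> (batch, 64, 1)
                logits = self.decoder(z.squeeze(-1))     # -> (batch, 1, grid, grid)
                probs = torch.softmax(logits.flatten(1), dim=1)
                return probs.view(-1, self.grid, self.grid)  # normalized heat-map

        model = HeatmapRegressor()
        heatmap = model(torch.randn(2, 100, 6))  # 2 sequences of 100 head poses
        print(heatmap.shape, heatmap[0].sum())   # torch.Size([2, 32, 32]), sums to 1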

    Gaze direction in the context of social human-robot interaction

    No full text
    Robots are more and more used in a social context. They are required not only to share physical space with humans but also to interact with them. In this context, the robot is expected to understand a number of ambiguous verbal and non-verbal cues that are constantly used in natural human interaction. In particular, knowing who or what people are looking at is very valuable information for understanding each individual's mental state as well as the interaction dynamics. This is called the Visual Focus of Attention, or VFOA. In this thesis, we are interested in using the inputs of an active humanoid robot – participating in a social interaction – to estimate who is looking at whom or what.

    On the one hand, we want the robot to look at people so that it can extract meaningful visual information from its video camera. We propose a novel reinforcement learning method for robotic gaze control. The model is based on a recurrent neural network architecture. The robot autonomously learns a strategy for moving its head (and camera) using audio-visual inputs, and is able to focus on groups of people in a changing environment.

    On the other hand, the camera images are used to infer the VFOAs of people over time. We estimate the 3D head pose (location and orientation) of each face, as it is highly correlated with the gaze direction. We use it in two tasks. First, we note that objects may be looked at while not being visible from the robot's point of view. Under the assumption that the objects of interest are being looked at, we propose to estimate their locations relying solely on the gaze directions of the visible people. We formulate an ad hoc spatial representation based on probability heat-maps, and we design and train several convolutional neural network models to perform a regression from the space of head poses to the space of object locations; this provides a set of object locations from a sequence of head poses. Second, we suppose that the locations of the objects of interest are known. In this context, we introduce a Bayesian probabilistic model, inspired by psychophysics, that describes the dependency over time between head poses, object locations, eye-gaze directions, and VFOAs. The formulation is based on a switching state-space Markov model. A specific filtering procedure to infer the VFOAs is detailed, as well as an adapted training algorithm.

    The proposed contributions use data-driven approaches and are addressed within the context of machine learning. All methods have been tested on publicly available datasets. Some training procedures additionally require simulating synthetic scenarios; the generation process is explicitly detailed.
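
    The abstract above also describes the reinforcement learning component without implementation detail. Below is a minimal, hypothetical REINFORCE sketch with a recurrent (GRU) policy, assuming a 16-dimensional audio-visual feature vector, five discrete head motor actions, and a stand-in reward such as the number of faces kept in view; none of these choices come from the thesis.

        import torch
        import torch.nn as nn

        class GazePolicy(nn.Module):
            """Recurrent policy: audio-visual features -> distribution over head motor actions."""
            def __init__(self, feat_dim=16, hidden=64, n_actions=5):
                super().__init__()
                self.gru = nn.GRU(feat_dim, hidden, batch_first=True)
                self.head = nn.Linear(hidden, n_actions)  # e.g. left/right/up/down/stay

            def forward(self, feats, h=None):  # feats: (batch, time, feat_dim)
                out, h = self.gru(feats, h)
                return torch.distributions.Categorical(logits=self.head(out)), h

        # One REINFORCE update on a toy episode with random stand-in data.
        policy = GazePolicy()
        opt = torch.optim.Adam(policy.parameters(), lr=1e-3)

        feats = torch.randn(1, 30, 16)   # 30 steps of audio-visual features
        dist, _ = policy(feats)
        actions = dist.sample()          # (1, 30) chosen head movements
        rewards = torch.rand(1, 30)      # stand-in for a "faces in view" reward
        returns = torch.flip(torch.cumsum(torch.flip(rewards, [1]), 1), [1])  # reward-to-go
        loss = -(dist.log_prob(actions) * (returns - returns.mean())).mean()  # mean baseline
        opt.zero_grad(); loss.backward(); opt.step()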


    Importance of ice algal production for top predators: new insights using sea-ice biomarkers

    No full text
    Antarctic seals and seabirds are strongly dependent on sea-ice cover to complete their life history. In polar ecosystems, sea ice provides a habitat for ice-associated diatoms that ensures a substantial production of organic matter. Recent studies have presented the potential of highly branched isoprenoids (HBIs) for tracing carbon flows from ice algae to higher-trophic-level organisms. However, to our knowledge, this new method has never been applied to sub-Antarctic species and Antarctic seals. Moreover, seasonal variations in HBI levels have never been investigated in Antarctic predators, despite a likely shift in food source from ice-derived to pelagic organic matter after the sea-ice retreat. In the present study, we described HBI levels in a community of seabirds and seals breeding in Adélie Land, Antarctica. We then validated that sub-Antarctic seabirds had lower levels of diene, an HBI of sea-ice diatom origin, and higher levels of triene, an HBI of phytoplanktonic origin, compared with Antarctic seabirds. Finally, we explored temporal changes in HBI levels after the ice break-up in summer. The level of diene relative to triene in Adélie penguin chicks increased and then declined during the breeding season, which was consistent with the short, intense proliferation of sea-ice algae in spring, followed by the pelagic phytoplankton bloom in summer. HBI biomarkers in Antarctic seabirds and seals thus indicate a shift from ice-algal-derived organic matter to a pelagic carbon source during the summer breeding season.
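
    As a small worked example of how such biomarker levels translate into an ice-versus-pelagic signal, the sketch below computes a relative index diene / (diene + triene). The index form and the sample values are assumptions made for illustration, not the paper's data or method.

        # Hypothetical illustration: a relative sea-ice carbon index from measured
        # HBI concentrations. The values below are invented, not the paper's data.
        def ice_algal_index(diene, triene):
            """Fraction of the HBI signal attributable to sea-ice diatoms
            (0 = fully pelagic, 1 = fully ice-derived)."""
            total = diene + triene
            return diene / total if total > 0 else float("nan")

        samples = {"sub-Antarctic seabird": (0.4, 5.1), "Adelie penguin chick": (3.2, 1.6)}
        for species, (d, t) in samples.items():
            print(f"{species}: {ice_algal_index(d, t):.2f}")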

    3D Anthropometry In Ergonomic Product Design Education

    No full text
    The Faculty of Industrial Design Engineering at the Delft University of Technology offers a bachelor's degree programme and three master's programmes. Our students attend lectures in ergonomics and learn to design and conduct ergonomics research. In this paper we describe the development of methods to realise ergonomic fit mapping based on 3D anthropometrics and to educate students on this topic. Due to the increasing availability of 3D scan data, we enter the complex field of 3D anthropometry and statistical shape models, an increasingly popular mathematical representation of 3D human shape variation. These facilities and this knowledge are particularly useful for products that must fit closely to the human body. The use of 3D anthropometrics is explained and practised in stages of increasing complexity. It starts with the use of 1D and 2D anthropometric data, the application of percentiles, and the DINED tool Ellipse, which shows the correlation between two body dimensions and its consequences for the related product dimensions. It ends with the use of 3D anthropometric data for the design of a cyclist's helmet, by way of a bivariate shape analysis of the head. We made efforts to lower the burden for students working with 3D scan data, for example by providing pre-processed 3D scan databases and case-specific measurement tables.
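
    The percentile and bivariate-ellipse reasoning taught in the early stages can be made concrete with a short script. The sketch below is not the DINED tool itself; the synthetic head dimensions and the 90% ellipse criterion are invented for the example.

        import numpy as np
        from scipy.stats import chi2

        rng = np.random.default_rng(1)
        # Synthetic head length / head breadth (mm) with a plausible correlation.
        mean, cov = [195.0, 155.0], [[64.0, 32.0], [32.0, 36.0]]
        heads = rng.multivariate_normal(mean, cov, size=1000)

        # 1D design rule: a one-size helmet liner spanning P5..P95 of head length.
        p5, p95 = np.percentile(heads[:, 0], [5, 95])
        print(f"head length P5-P95: {p5:.0f}-{p95:.0f} mm")

        # Bivariate check: the share of users inside the 90% ellipse. Designing
        # on each dimension's P5..P95 separately covers fewer people than the two
        # univariate ranges suggest, which is the point of the Ellipse exercise.
        centered = heads - heads.mean(axis=0)
        d2 = np.sum(centered @ np.linalg.inv(np.cov(heads.T)) * centered, axis=1)
        inside = np.mean(d2 <= chi2.ppf(0.90, df=2))
        print(f"share inside 90% ellipse: {inside:.2f}")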

    Pancreas collagen digestion during islet of Langerhans isolation

    No full text
    The success of pancreatic islet isolation largely depends on donor characteristics, including the composition of the extracellular matrix, of which collagen is the main component. We hypothesized that isolation yields are proportional to the percentage of collagen digested, and aimed to determine a threshold that predicts isolation success.
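
    One simple way to derive such a threshold from donor data is to scan candidate cut-offs and keep the one maximising Youden's J. The sketch below illustrates this with invented numbers and should not be read as the study's actual criterion or results.

        import numpy as np

        # Invented per-donor data: collagen digestion (%) and isolation success.
        digestion = np.array([35, 42, 50, 55, 61, 68, 72, 80, 85, 90], dtype=float)
        success = np.array([0, 0, 0, 1, 0, 1, 1, 1, 1, 1])

        best_t, best_j = None, -1.0
        for t in np.unique(digestion):
            pred = digestion >= t
            sens = np.mean(pred[success == 1])   # true-positive rate
            spec = np.mean(~pred[success == 0])  # true-negative rate
            j = sens + spec - 1                  # Youden's J statistic
            if j > best_j:
                best_t, best_j = t, j
        print(f"suggested threshold: {best_t:.0f}% digestion (Youden J = {best_j:.2f})")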
